HILBERT BASES FOR ORTHOGONAL ARRAYS 



ENRICO CARLINI AND GIOVANNI PISTONE 

Abstract. In this paper, we relate the problem of generating all 2-level orthog- 
onal arrays of given dimension and force, i.e. elements in OA(n,m), where n is 
the number of factors and m the force, to the solution of an Integer Programming 
problem involving rational convex cones. We do not restrict the number of points 
in the array, i.e. we admit any number of replications. This problem can be theo- 
retically solved by means of Hilbert bases which form a finite generating set for all 
the elements in in the infinite set OA(n,m). We discuss some examples which are 
explicitly solved with a software performing Hilbert bases computation. 



1. Introduction 

We shall investigate orthogonal arrays with 2 levels coded —1, +1. In this paper, a 
full factorial design D{n) is the Cartesian product { — 1, +l} n and a multi-subset F of 
D(n) is called a fraction. If D(n) is embedded in the afhne space Q n , it is the set of 
solutions of the syst em of polynomial equation s x\ — 1 = 0, i — 1, . . . , n. In a series of 
paper, starting with Pistone and Wyrrnl ( 19961 ). this approach has been systematically 
developed, mainly because of the availability of algorithms and software able to solve 
systems of polynomial equations with rational coefficie nts. The state of the art on 
year 2000 in discussed in the book lPistone et al.l (120011 ). 

In modern geometric terms, a fraction is a non-reduced 0-dimensional scheme sup- 
ported on a subset of D{n). The geometry of such schemes has probably many inter- 
esting thing to say to statistical design theory, but this will be the subject of further 
work. Here, we take a simpler approach, consisting in the remark that a fraction with 
replicates is fully described by a functions R : D(n) giving the number of replications 
R(a) = 0, 1, 2, . . . of each point a G D(n). If the points of the full factorial design are 
listed in some order, then R is a vector with non negative integer elements. A fraction 
of D(n) such that all of its m-dimensional orthogonal projections are replications of a 
full factorial designs of the same cardinality (as multi-sets) is called an n-dimensional 
orthogonal array of force m. We denote with OA(n, m) the set of such fractions. 

One possible algebraic approach to the problem of describing the elements of 
OA(n, m) uses polynomial counting functions, i.e. polynomial functions that take 
non negative integer values on D(n). A polynomial counting function on D(n) that 
takes only the value and 1 is called an indicator polynomial function. An indicator 
polynomial functi on is associated with a fraction with n o repli cation s. Th is approach 
was developed in IFontana et al.l (119961) . iFontana et al.l (|2000l). lYg (120031 ) for 2-level 
design and generalized in lYd (l2004 ). |Pistone and Rogantinl tom . 

In order to implement this kind of study, we need to describe the ring of Q-valued 
functions over D(n). This can be done by means of standard techniques in Algebraic 
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Statistic, see e.g lPistone et al.l (120011 ). Let Q be the rational number field and consider 
the affine space Q n . By realizing D(n) as a subset of Q n , we can describe it using alge- 
braic equations. More precisely, D{n) is associated to the ideal I C R — Q[xi, . . . , x n ] 
generated by 

x\ — 1 = 0, % — 1, . . . , n. 

The ring of function over D(n) is then the quotient ring R/I. As a Q-vector space, 
R/I is generated by the following set of monomials: 

{l a :a=K..,a„)Gi = {0,l} 2 } • (1) 

The capital X is intended to distinguish between the indeterminate x and the corre- 
sponding function defined on D(n), i.e. X a (a) = YYi=i a T ■> a e D(n). 
Given a fraction F of D(n), its counting function can be written as 

Rf = ^ ^ ba-X a 
a 

for a unique choice of constants 6 a 's. Notice that Rp uniquely determines the fraction. 

The previous setting is fully general and applies to any finite subset of the affine 
space Q n . However, the case we are discussing is quite special, essentially because the 
mapping a \— > P3" =1 ( — l) a * is a C-representation of L as the additive group mod 2. 
Then, the monomial functions in the vector basis in ([1]) are orthogonal: 



E X a (a)X?(c 

a<=D(n) 



if a /3, and 
if a = (3. 



In particular, ^ agfl , , X a (a) = 0, a E L = L \ {0}. Then, the projection of the 
fraction F on the factors with indices in J C {l,...,n} is defined on D(J) with 
counting function 

R M) = E E ^(a^x^cfej) = e x ^) aj 

bjeD(J c ) a=aj+f3j£L=LjxLjc ocjeLj 

The conclusion is that condition F G OA(n, m) can be easily expressed in terms of 
the fee's. To see this, fix a subset of indexes J C {1, . . . , n} of cardinality m. As i?j 
has to be constant to be the counting function of a replication of a full factorial,then 
the orthogonality condition is expressed imposing the vanishing of all fe's other than 
bo in this expression for each such J. Namely we get conditions: b a = 0, a ^ has 
no more than m non-zero component. 

A counting polynomial R is a polynomial which has non negative integer values on 
D(n). An algebraic way to say that is the following. Let f n (x) = x(x — 1) ■ ■ • (x — n) 
be the factorial polynomial of order n. A polynomial R takes values in Z + if all but a 
finite number of the polynomials f(R n (x)) are identically zero on D(n). In particular, 
the values are and 1 if and only if R(R — 1) = on D(n), or J2 a +f3=jb a bi3 = 
for all 7 G L. Thus we have a method to find the elements of OA(n, m) without 
replications, namely we have to solve in the 6 Q 's the system 



b a = l<E«< m 
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This was done in Font ana et al. d2000h usine; the software CoCoA. For future reference 
we reproduce here the results for the case OA(5, 2). This method founds 1054 frac- 
tions, that were classified as follows. Below, Cj denote the term X a with a(i) = 1 
for i e / and zero otherwise. 

(1) 92 are regular fractions; there exist 3 classes of equivalence for change of signs 
and permutation of factors. Among those, 

(a) 32 have b = 1/2, i.e. 16 points: 

(i) 2 have counting polynomial of the form: 

2 ' n^ijhkl i 

and the fraction has resolution 5 and projectivity 4. 

(ii) 10 have counting polynomial of the form: 

1 1 

2 + 2' 

the fraction has resolution 4 and projectivity 3; moreover it fully 
projects on the {j/iA;Z}-factors, on the {i/ifcZ}-factors, on the {ijkl}- 
factors and on the {ijhl}-t 'actors. 

(iii) 20 have counting polynomial of the form: 

\ + \c, lh ; 

the corresponding fraction has resolution 3 and projectivity 2; more- 
over it fully projects on the {j/ifc/}-factors, on the {z/tA;Z}-factors 
and on the {r/'fc/j-factors. 

(b) 60 have bo — 1/4 (8 points) and resolution 3. This is an unique class of 
equivalence for change of signs and permutation of factors. A counting 
polynomial is of the form: 

1 1^ 1^ 1 „ 

- + ~^ijh + -Chki + -L'ijki ■ 

(2) 940 are non regular fractions with resolution 3: 

(a) 192 have b = 3/8 (12 points); this is an unique class of equivalence for 
change of sign and permutation of factors. The counting polynomials 
have all the coefficients of the interactions of order 3 and 4 equal, in 
absolute value, to ~ and the coefficient of the interaction of order 5 equal 
to 0. 

(b) 520 have &o — 1/2 (16 points); there exist 4 classes of equivalence for 
change of sign and permutation of factors: 

(i) a counting polynomial is: 

2 + + + ^Cijta - -Cijhki ; 

the corresponding fraction fully projects on the j/ifcZ- factors, on the 
ihkl- factors and on the r/'/cZ-factors. 

(ii) a counting polynomial is: 

X + V +V +V V • 

2 + 4°u'fc _ ' 
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the fraction fully projects on the jhkl-factors and on the ihkl- 
factors. 

(iii) an counting polynomial is: 

the fraction fully projects on the jhkl-faetors and on the i/ifcZ- 
factors. 

(iv) a counting polynomial is: 

2 + jCijh + -^Cijk + -Gift/ — -Ciki ; 

the fraction fully projects on the jhkl-i 'actors. 

(c) 192 have bo = 5/8; they are the complement of the fractions with b = 3/8. 

(d) 60 have bo = 3/4; they are the complement of the fractions with b = 1/4. 

See the original paper for more details. Discarding the 192 + 60 = 252 fractions that 
are complement of a smaller one, in conclusion there are 1054 - 252 = 802 fractions, 
60 with 8 points, 192 with 12 points, 552 with 16 points. 

2. Cones and semi-groups 



We follow the presentation of ISchrijverl (119861 ). Let C be a (rational) convex cone 



in Q n , i.e. C C Q n and x,y E C, A, fi E Q + imply Xx + \xy E C. A convex cone is a 
convex subset of Q n , so we say simply cone. The cone C is polyhedral if the following 
two equivalent conditions are satisfied: 

(1) C = {x E Q : Ax > 0}, for some matrix A E Q fc ' n ; 

(2) C = {XiXi + • • • + X m x m : Ai, . . . , X m E Q+}, for some finite set of vectors 
Xi, . . . , x m E Q m , called generating set of C . 

Notice that the direction of the inequality in Item [1] is unessential because we can 
change A to —A, and that we could have equality, represented with a 2k x m matrix 
with blocks A and —A. The cone is said to be pointed if C fl —C = {0}, which in 
turn is equivalent to Ay = implies y = 

Let O = C H Z n be the set of lattice points of the cone C. Then O is a sub-semi- 
group of Z n because r,s E O implies r + s E O. If Xi, . . . , x m is a set of generators of 
the cone C, than each lattice point r E O can be written as r = X\X\ + ■ • • + X m x m 
with rational coefficients. 

A Hilbert basis of O = C fl Z n is a finite set of elements r 1; . . . ,77 such that any 
other ele ment of O is line ar combination with non-negative integer coefficients of the 
rj's, see ( Schrijver . 19861 . Sec. 16.4). In other words, a Hilbert basis is a generating 



set of O as a semi-group. 

Theorem 2.1 (Existence and uniqueness of Hilbert bases). Each rational polyhedral 
cone is generated by an Hilbert basis of its lattice points. If C is pointed, then there 
exist a unique inclusion-minimal Hilbert basis. 

The previous Theorem does not apply to more general sub- semi-groups of Z n . In 
fact, the following was proved by Hemmecke and Weissmantel. 

Theorem 2.2. A sub-semi-group S of Z n has a finite generating set if and only if 
the cone of S is polyhedral. 
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We are going to use the previous theory in a special case. We are given a rational 
matrix Mi G Q N ' k and the cone of non-negative y G such that y t M 1 = 0. This 
is a polyhedral cone defined by M\y = and Iy > 0. Notice that the vector space 
ker M* has a linear basis of integral vectors. Moreover, if l*Mx = 0, then there exist a 
linear basis of non-negative integral vectors. This basis allows us to obtain all lattice 
points using rational coefficients while the Hilbert basis, being bigger, does the same 
using non-negative integer coefficients. 



3. Fractions and semi-groups 

An approach to the generation of all OA(n, m) involves Hilbert bases of a semi- 
group. Let R(a), a G D{n) be a vector of replicates, R(a) G Z + . As an integer valued 
polynomial of the ring Q[x±, . . . , x n ], reduced on the ideal I(n), R can be written on 
D{n) uniquely as 

R(x) = b a x a , 

L = {0, l} n . Notice that, for a fixed a G L, we have 
R{a)X a {a) = 2 n -b a . 

a£D(n) 

Thus we have a linear system of equations in the vector of replicates R — (. . . , F(a), . . .) 
as a varies in L. Assume now we have a subset L\ C L = L \ 0, and we want b a = 
for all a G L\. Then we are looking for the non negative integer solutions of the 

system 

R(a)X a (a) = a G L x (2) 

a<=D(n) 

The model matrix restricted to Li, namely Mi = (x Q (a)) ag £i( n ) jae L 1 , is the matrix of 
coefficients in Equation The non negative integer solutions of the integer linear 
system of equations R* ■ M = are the lattice points of the cone C = kerM^. Hence 
we look for generators of the semi-group O = C fl Z" . 

A Hilbert basis of O is a finite and minimal set of elements ?r, . . . , ri such that any 
other element of C + is linear combination with non-negative integer coefficients of 
the rj's. The list L\ consists of all interactions of order between 1 and m, gives us 
an element in OA(n, m) and the integers combination Yl n i r i> ^ ne fraction obtained 
taking the union of the corresponding fractions n.j times each, is the generic element 
of OA(n, m). Notice that the matrix M\ has entries equal to either 1 or —1, hence 
kerM^, as a Q-vector space has a basis of 2 n — non negative integer vectors. 
These basis elements produce elements in OA(n, m) which are independent, i.e. one 
does not decompose as union of the others. However, in general this is not a Hilbert 
basis as these fractions can be decomposed as union of other elements in OA(n, m). 

Algorithms for computing Hilber t bases are implemented in specialized software , 
namely 4ti2, see iHemmecke et al. and CoCoA, see ICoCoATeaml (jno datd ). 



Using the former we present some examples of interest in order to discuss the use of 
Hilbert bases in designs theory. We remark that the com putation of Hilber t bases 



with 4ti2 uses the Project-and-Lift algorithm described in iHemmeckd (120061 ). 
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generated with the software R using the function computing the generalized Kronecker 
product of two arrays. Then, the treatment points and the interactions are listed in 
right-to-left lexicographic order. The a's of the interactions of order 1,2,3 are printed 
in the order: 00001, 00010, 00011, 00100, 00101, 00110, 00111, 01000, 01001, 01010, 
01011, 01100, 01101, OHIO, 10000, 10001, 10010, 10011, 10100, 10101, 10110,11000, 
11001, 11010, 11100. Treatment values are shown; column names are not printed to 
save space. 

One run on my slow laptop of the 4ti2 function hilbert has taken 0.05 seconds 
to find the 28 elements in the minimal Hilbert basis of OA(5,3), see Table [2j There 
are 12 arrays with 16 treatments an no replications; there are 16 orthogonal arrays 
with 24 points, all with replications. 

4.2. OA(5, 2). In this case the 32 x 15 matrix Mi is given in Table [31 In this case the 
run of hilbert took 1008.97 seconds, and the number of elements of the Hilbert basis 
is 26142. Some of the arrays are quite unusual. Table H] shows the distribution of the 
support, the totals and the maximum number of replication in each element of the 
Hilbert basis. Let us consider first the 60+192+162 = 414 elements of the Hilbert 
basis with no replications. If we compare this table with the results mentioned in 
Section 1, we see that the cases with 8 or 12 points are the same number, and 
actually the same fractions. In the case with 16 points, 552 - 162 = 390 are missing. 
This means that in this case all the extra fractions are actually the union of two 
fractions with 8 points and disjoint support. To check this, we find that 450 couples 
of elements of the basis with 8 elements have disjoint support. 

4.3. OA(6,2). This case appears to be not solvable with this technology because of 
the long computational time. The authors of the software we are using announced 
the future release of an option to the program hilbert that computes only the 0- 
1 vectors. In our case, this would compute only the OA's with no replication. The 
examples above show why we expect this number to be much smaller than the number 
of elements in the full Hilbert basis and, as a consequence, to be computable in 
practical times. A different option consist in the search of a basis of the OA(6, 2) 
such that prescribed entries are zero. The assignment to zero of elements could be 
done by random sampling, or in a systematic way. As an example, if we force ten 
zeros to the first entries with the ordering of treatments as in the previous examples, 
then the Hilbert basis has 6 elements. 

5. Discussion 

The case OA(6, 2) could not be computed because of the excessive computational 
time. As many other general methods, this one is currently restricted to small ex- 
ample, and has to be considered mainly of conceptual interest. The main problem 
is the big number of fractions with replications bigger than 1. A fraction with no 
replication can be generated only by elements of the basis of the same type, then the 
algorithm could possibly be modified to compute only the subset generating fractions 
without replications bigger than 1. 

Constrains of the form XlaeD -^( a ) = T for a given total T, or R(a) = 1 for some 
a G D(n), change the character of th problem and require different algorithms and 
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Table 4. Support (supp), total, maximum replication (maxrep), for 
the Hilbert basis of the OA(5, 2) 





8 


12 


16 


20 


total 
24 


28 


32 


36 


1 


maxrep 
2 3 


4 


supp 

8 


60 























60 











11 





32 























32 








12 





192 




















192 











13 











480 

















480 








15 














1920 

















1920 





16 








162 








2624 


5760 


2880 


162 





8064 


3200 


1 O 

18 














5760 


5760 











5760 


5760 





19 











480 

















480 








21 























32 





32 








maxrep 
1 


60 


192 


162 

























2 





32 





960 


5760 








32 










3 














1920 


8064 


5760 













4 

















320 





2880 











Fontana R., Pistone G. and Rogantin M.P. (2000). Classification of two-level factorial 

fractions, J. Statist. Plann. Inference, 87, 149-172. 
Hemmecke R. (2006). On the computation of hilbert bases and extreme rays of cones, 

Tech. Rep. CO/0203105, arXiv. 
Hemmecke R., Hemmecke R. and Malkin P. (2005). 4.U2 version 1.2 — computation 

of Hilbert bases, Graver bases, toric Grobner bases, and more, Available at 

www.4ti2.de. 

Pistone G., Riccomagno E. and Wynn H.P. (2001). Algebraic Statistics: Computa- 
tional Commutative Algebra in Statistics, Chapman&Hall. 

Pistone G. and Rogantin M. (2006). Indicator function and complex coding for mixed 
fractional factorial designs (revised 2006), Tech. Rep. 17, Dipartimento di Matem- 
atica, Politecnico di Torino. 

Pistone G. and Wynn H.P. (1996). Generalised confounding with Grobner bases, 
Biometrika, 83, 653-666. 

Schrijver A. (1986). Theory of linear and integer programming, Wiley-Interscience 
Series in Discrete Mathematics, John Wiley & Sons Ltd., Chichester, a Wiley- 
Interscience Publication. 

Ye K.Q. (2003). Indicator function and its application in two-level factorial designs, 
Ann. Statist., 31, 984-994. 

Ye K.Q. (2004). A note on regular fractional factorial designs, Statistica sinica, 14, 
1069-1074. 

DIMAT Politecnico di Torino 



DIMAT Politecnico di Torino 

E-mail address: enrico.carlini@polito.it 

E-mail address: giovanni.pistone@polito.it 



